W-TSV: Weighted topological signature vector for lexicon reduction in handwritten Arabic documents

نویسندگان

  • Youssouf Chherawala
  • Mohamed Cheriet
چکیده

This paper proposes a holistic lexicon-reduction method for ancient and modern handwritten Arabic documents. The word shape is represented by the weighted topological signature vector (W-TSV), which encodes graph data into a low-dimensional vector space. Three directed acyclic graph (DAG) representations are proposed for Arabic word shapes, based on topological and geometrical features. Lexicon reduction is achieved by a nearest neighbors search in the W-TSV space. The proposed framework has been tested on the IFN/ENIT and the Ibn Sina databases, achieving respectively a degree of reduction of 83.5% and 92.9% for an accuracy of reduction of 90%.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Holistic Approach for Classifying and Retrieving Personal Arabic Handwritten Documents

This paper presents a novel holistic technique for classifying and retrieving Arabic handwritten text documents. The retrieval of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several features are extracted from these connected parts and then co...

متن کامل

Arabic word descriptor for handwritten word indexing and lexicon reduction

Word recognition systems use a lexicon to guide the recognition process in order to improve the recognition rate. However, as the lexicon grows, the computation time increases. In this paper, we present the Arabic word descriptor (AWD) for Arabic word shape indexing and lexicon reduction in handwritten documents. It is formed in two stages. First, the structural descriptor (SD) is computed for ...

متن کامل

یک روش دو مرحلهای برای بازشناسی کلمات دستنوشته فارسی به کمک بلوکبندی تطبیقی گرادیان تصویر

This paper presented a two step method for offline handwritten Farsi word recognition. In first step, in order to improve the recognition accuracy and speed, an algorithm proposed for initial eliminating lexicon entries unlikely to match the input image. For lexicon reduction, the words of lexicon are clustered using ISOCLUS and Hierarchal clustering algorithm. Clustering is based on the featur...

متن کامل

Classification of Personal Arabic Handwritten Documents

This paper presents a novel holistic technique for classifying Arabic handwritten text documents. The classification of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several structural and statistical features are extracted from these connected ...

متن کامل

Use of the Shearlet Transform and Transfer Learning in Offline Handwritten Signature Verification and Recognition

Despite the growing growth of technology, handwritten signature has been selected as the first option between biometrics by users. In this paper, a new methodology for offline handwritten signature verification and recognition based on the Shearlet transform and transfer learning is proposed. Since, a large percentage of handwritten signatures are composed of curves and the performance of a sig...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Pattern Recognition

دوره 45  شماره 

صفحات  -

تاریخ انتشار 2012